Handwritten Word Image Categorization with Convolutional Neural Networks and Spatial Pyramid Pooling

نویسندگان

  • J. Ignacio Toledo
  • Sebastian Sudholt
  • Alicia Fornés
  • Jordi Cucurull-Juan
  • Gernot A. Fink
  • Josep Lladós
چکیده

The extraction of relevant information from historical document collections is one of the key steps in order to make these documents available for access and searches. The usual approach combines transcription and grammars in order to extract semantically meaningful entities. In this paper, we describe a new method to obtain word categories directly from non-preprocessed handwritten word images. The method can be used to directly extract information, being an alternative to the transcription. Thus it can be used as a first step in any kind of syntactical analysis. The approach is based on Convolutional Neural Networks with a Spatial Pyramid Pooling layer to deal with the different shapes of the input images. We performed the experiments on a historical marriage record dataset, obtaining promising results.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Convolutional Neural Network based on Adaptive Pooling for Classification of Noisy Images

Convolutional neural network is one of the effective methods for classifying images that performs learning using convolutional, pooling and fully-connected layers. All kinds of noise disrupt the operation of this network. Noise images reduce classification accuracy and increase convolutional neural network training time. Noise is an unwanted signal that destroys the original signal. Noise chang...

متن کامل

Temporal Pyramid Pooling Based Convolutional Neural Networks for Action Recognition

Encouraged by the success of Convolutional Neural Networks (CNNs) in image classification, recently much effort is spent on applying CNNs to video based action recognition problems. One challenge is that video contains a varying number of frames which is incompatible to the standard input format of CNNs. Existing methods handle this issue either by directly sampling a fixed number of frames or ...

متن کامل

A Sequence Labeling Convolutional Network and Its Application to Handwritten String Recognition

Handwritten string recognition has been struggling with connected patterns fiercely. Segmentationfree and over-segmentation frameworks are commonly applied to deal with this issue. For the past years, RNN combining with CTC has occupied the domain of segmentation-free handwritten string recognition, while CNN is just employed as a single character recognizer in the over-segmentation framework. ...

متن کامل

Adaptive Deep Pyramid Matching for Remote Sensing Scene Classification

Convolutional neural networks (CNNs) have attracted increasing attention in the remote sensing community. Most CNNs only take the last fully-connected layers as features for the classification of remotely sensed images, discarding the other convolutional layer features which may also be helpful for classification purposes. In this paper, we propose a new adaptive deep pyramid matching (ADPM) mo...

متن کامل

Static Image Classification based on ScSPM and LBP histogram Fourier (LBP-HF) Features

In the recent digital age, support vector machines (SVMs) that use a spatial pyramid matching (SPM) kernel have been around the globe for image classification. Although this is popular, there exists many problems in its use; for examples, nonlinear SVMs have a high complexity in training and testing. Applying the algorithms to big datasets, which holds many images, greater than a thousand is a ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016